Abstract — This paper presents a (semi-)automatic processing
technique for GPR data analysis. Exploiting the generalization
capabilities of artificial neural networks (ANN), it is
shown that a Multi-Layer Perceptron (MLP) can be fed
with a suitable set of input features in order to determine the
permittivity of a ground layer. A detailed performance assessment
shows that the algorithm provides very promising
results, reconstructing with high accuracy the dielectric
properties of layers with both planar and rough surfaces. Some
critical issues have nevertheless emerged that limit the
effectiveness of the method to lossless media.

Index Terms — Inverse scattering, artificial neural networks, 
GPR. 

I. Introduction 

The term non-destructive testing (NDT) denotes a wide 
group of analysis techniques used in science and industry 
to evaluate the properties of a material, component or system 
without causing damage [1]. NDT covers a broad and 
interdisciplinary range of research fields, from biomedical 
engineering to geology, providing an excellent balance 
between quality and cost-effectiveness. A typical device used
in NDT is the Ground Penetrating Radar (GPR), designed
primarily for the detection of objects and/or interfaces below
the earth's surface [2]. As a radar instrument, the GPR consists
of a transmitting antenna (TX), which emits e.m. pulses towards
the ground, and a receiving antenna (RX), which records in a
so-called trace the travel time and amplitude of the backscattered
wavelets (see Fig. 1). In standard GPR measurements [3], the
antennas are pulled along the survey track while traces are
triggered at a fixed interval by a measurement wheel
connected to the back of the antenna. This results in a series
of traces, which are finally displayed by the measurement
software as a function of position and time in a so-called
radargram, shown in Fig. 2. The analysis of radargrams
requires long-term expertise, and no well-established
techniques for automatic processing are available yet.
The scope of this paper is therefore to provide a (semi-)automatic
processing chain for one of the open issues in
NDT: the inversion of GPR data for the reconstruction of the
dielectric characteristics of a surface layer. Such information
is of great interest to many research areas, for instance when
soil moisture content must be evaluated [4] or for
measurements of crop canopy properties [5].




Figure 1. GPR system: a transmitting antenna (TX) emits e.m. 
pulses towards the ground, while a receiving antenna (RX) records 
the travel time and amplitude of the backscattered wavelets 
Figure 2. A radargram sample from on-field surveys (a).
In (b) a single trace is highlighted.

To study the problem thoroughly, an aspect that is often
disregarded must first be discussed, i.e. how a rough surface
can influence the received e.m. field. Solution approaches
commonly assume flat interfaces for simplicity [6]-[7];
nevertheless, it has been demonstrated that
realistic ground models can yield significant improvements
[8]-[9]. In practice, the flat interface approximation
can lead to appreciable errors, as shown in Fig. 3, where
the signals received from different roughness profiles are
depicted. With respect to the planar surface
response in Fig. 3a, the tail of the signals can exhibit either
ripples (Fig. 3b-c) or strong attenuation (Fig. 3d), due to
multipath reflections. Such differences in the
response could clearly compromise the inversion process.



Figure 3. Effects of layer roughness on the received signals. 

To solve the described problem, which would otherwise require a
complex formalization of the physical phenomena involved, we
propose here a method based on artificial neural networks
(ANN). Techniques of this kind have recently gained much
interest in inverse scattering problems thanks to their ability
to find a regression model that relates the e.m. data
to the desired outputs [10]. In the next section, the algorithm
is presented, first introducing some background information
and then describing the actual processing scheme. Results
and performance assessment are presented in detail
in Section III. Section IV concludes the discussion with final
remarks.

II. Methodology

Multi-Layer Perceptrons (MLPs) are feedforward artificial
neural networks consisting of fully connected neurons
arranged into layers, typically an input, a hidden and an output
layer (see Fig. 4). Provided only that a proper number of
hidden units and sufficiently smooth activation functions
are available, they are capable of approximating any
functional relation arbitrarily well [11]. In particular, for a
three-layer MLP with topology L-M-N, the output $y_n$ of each
output node is given by:

$$ y_n = f\left( \sum_{m=1}^{M} w^{(2)}_{nm} \, f\left( \sum_{l=1}^{L} x_l \, w^{(1)}_{ml} + w^{(1)}_{m0} \right) + w^{(2)}_{n0} \right) \qquad (1) $$

where x is the input vector, f represents a sigmoid function
and $w^{(1)}$ and $w^{(2)}$ denote the input-hidden and hidden-output
weights, which are internal adjustable parameters of the
network. The knowledge of the MLP is acquired through a
supervised learning process, where the availability of a set
of ground-truthed input/output samples is assumed [12].
Given the training set T, the free parameters of the network
are adapted through a recursive minimization of the error
between the actual and the expected values of the output,
according to the Back Propagation (BP) algorithm [13].
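A forward pass through a three-layer L-M-N MLP of the kind described above can be sketched as follows. This is only an illustrative sketch: the logistic form of the sigmoid, its use on the output node as well as on the hidden nodes, the explicit bias terms and all function and argument names are assumptions made for the example, not details taken from the paper.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, assumed here as the activation function f."""
    return 1.0 / (1.0 + math.exp(-a))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer L-M-N perceptron.

    x        : list of L input features
    w_hidden : M rows, each holding L input-hidden weights
    b_hidden : M hidden-node biases
    w_out    : N rows, each holding M hidden-output weights
    b_out    : N output-node biases
    """
    # Hidden layer: weighted sum of the inputs plus bias, then sigmoid.
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: weighted sum of the hidden activations plus bias, then sigmoid.
    return [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(w_out, b_out)
    ]


# Toy 2-2-1 network with every weight and bias set to zero: each hidden
# unit outputs f(0) = 0.5, and so does the single output node.
y = mlp_forward([0.3, -1.2], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [[0.0, 0.0]], [0.0])
```

With a sigmoid on the output node the result lies between 0 and 1, so in this sketch the target quantity would have to be normalized to that range before training and rescaled afterwards.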
Thanks to their generalization capabilities, MLPs therefore seem
a suitable choice for reconstructing the permittivity
of a layer by inverting GPR data. To this end, in agreement
with the discussion above, a proper network topology and
set of input features must be chosen. As the only parameter
that needs to be reconstructed is the dielectric constant of
the surface under test, the output layer of the MLP consists
of a single node. As far as the input layer is concerned,
we must consider that, provided that the air wave in
bistatic GPRs can be windowed out, or assuming a monostatic
GPR, the first wavelet reaching the RX is the ground wave,
which is exactly the signal containing the information about
the layer's permittivity. Hence, we propose to sample the
radargram trace at the times $t_k$ corresponding to its first K
maxima/minima $P_k$, and then feed the MLP with the vector:

$$ \mathbf{x} = \{P_k, t_k\}_{k=1,\dots,K} \qquad (2) $$

Figure 4. Architecture of a three-layer MLP, with L input nodes, M
hidden nodes and N output nodes.

The rationale behind this choice is that, on the one hand,
maxima and minima hold a higher information content and,
on the other hand, they can be detected quite simply through
a peak detector. There are no well-established criteria to
determine which and how many input features, or how many
units within the hidden layer, should be employed
to achieve the desired results. A rule of thumb states that, to
ensure an adequate accuracy level, the number
of training samples S should not be less than the total number
W of degrees of freedom (weights) of the network [14]:

$$ S \geq W = M(L + 1) + N(M + 1) \qquad (3) $$

Therefore, the dimensioning of the network becomes a
key point of the whole algorithm and should strike a
satisfactory trade-off between the network's complexity and its
generalization capabilities. Noting that the transmitted signal
has two local maxima and one minimum, and that any further
peak implies the presence of multiple reflections, we opted to
employ as input features the first K=4 peaks, which
feed an 8-8-1 MLP, i.e. one with the same number of units in
the input and hidden layers.
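As a quick arithmetic check of the rule of thumb in Eq. (3) for the chosen 8-8-1 topology (L = 8, M = 8, N = 1), and assuming one bias weight per hidden and output node, as the (L + 1) and (M + 1) factors suggest:

```latex
W = M(L+1) + N(M+1) = 8 \cdot 9 + 1 \cdot 9 = 81
```

so the rule of thumb calls for a training set of at least 81 samples.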






ACEEE Int. J. on Information Technology, Vol. 02, No. 01, March 2012 






Figure 5. Algorithm workflow: GPR data, peak detector, MLP.

The full processing scheme of the proposed algorithm for 
the NDT of surface layers is shown in Fig. 5. After the 
acquisition of GPR data, the radargram traces are processed 
by a peak detector, which extracts the amplitudes and
corresponding times $\{P_k, t_k\}_{k=1,\dots,K}$ of the zero-derivative
points of the signals. Then, the outputs of the peak detector are
stored in a data vector and fed to the MLP, which finally 
returns the dielectric permittivity of the layer under test. 
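The peak-detection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `first_peaks`, the sign-change test on the discrete slope (used as a stand-in for a zero-derivative test), the interleaved ordering of amplitudes and times, and the toy trace are all assumptions.

```python
def first_peaks(trace, dt, k=4):
    """Return [P_1, t_1, ..., P_k, t_k] for the first k local extrema.

    trace : list of sampled field amplitudes
    dt    : sampling interval (any consistent time unit)

    A sample counts as an extremum when the discrete slope changes sign
    around it. Flat plateaus and noisy traces are not handled here.
    """
    features = []
    for i in range(1, len(trace) - 1):
        left = trace[i] - trace[i - 1]
        right = trace[i + 1] - trace[i]
        if left * right < 0:  # slope changes sign: local maximum or minimum
            features.extend([trace[i], i * dt])
            if len(features) == 2 * k:
                break
    if len(features) < 2 * k:
        raise ValueError("trace has fewer than k extrema")
    return features


# Toy trace with four alternating extrema, sampled every 0.5 time units.
toy_trace = [0.0, 1.0, 0.0, -2.0, 0.0, 0.5, 0.0, -0.1, 0.0]
features = first_peaks(toy_trace, dt=0.5, k=4)
# features == [1.0, 0.5, -2.0, 1.5, 0.5, 2.5, -0.1, 3.5]
```

With K=4 the returned list holds 8 values, matching the number of input nodes of the 8-8-1 network.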

III. Results 

The performance assessment of the algorithm described 
in the previous section has been carried out over a dataset of 
200 different scenarios, simulated by means of GprMax, a
software package based on the FDTD numerical method [15].




Figure 6. Rough surface model: each discontinuity is generated by
imposing a vertical and a horizontal shift from two reference planes.

To reduce computational costs, we have set up a two-dimensional
simulation domain: thanks to the specific problem geometry,
this approximation does not affect the validity of the test.
According to the requirements of the algorithm, we have
simulated a monostatic GPR, modelling the TX as a current
wire fed by a differentiated Gaussian pulse of central frequency
2 GHz, and the RX as an ideal probe placed at the same
position as the TX. As regards the geophysical properties of
the layers, we have considered lossless media with
permittivity $\varepsilon_L$ ranging from 2 to 20 and generated roughness
profiles considering random vertical and horizontal deviations
($x_V$ and $x_H$) from suitable reference planes within the interval
[0.5; 1.5] cm (see Fig. 6). This interval is consistent with the
spatial resolution supplied by the system. The whole dataset
has first been processed in order to extract the 8 input features
needed by the MLP and then split into two separate groups
employed as training and test set, respectively. It is interesting
to notice that the four series of amplitudes $P_1$, $P_2$, $P_3$, $P_4$,
depicted in Fig. 7, do not exhibit the expected monotonically
increasing behavior (the higher the permittivity, the greater
the backscattering), but fluctuate around a mean value. This
implies that the same peak value of the electric field could be
extracted from distinct signals corresponding to different
permittivities (e.g. $P_i[\varepsilon_L=12] \approx P_i[\varepsilon_L=15]$), within the same
signal at different positions (e.g. $P_i[\varepsilon_L=5] \approx P_j[\varepsilon_L=5]$
with $i \neq j$), or both (e.g. $P_i[\varepsilon_L=5] \approx P_j[\varepsilon_L=7]$).
Hence the importance of feeding the MLP not only with amplitude
values but also with the relative delays.

Figure 7. Electric field maxima/minima normalized values extracted
within GPR data from different scenarios. P1: blue line; P2: green
line; P3: red line; P4: cyan line.

Table I. Absolute and relative errors for S=100.

According to Eq. 3, we expect
the network to show good generalization capabilities with
more than 80 training samples. Table I shows the percentage
relative error (see footnote 1) between the network's output and
the actual permittivity values in the case of S=100. It can be seen
that the MLP performs very well, with a mean error of
around 2.50% and a maximum error which remains below 10%.
In absolute terms, this means that the dielectric constant of a
layer can be reconstructed with an average accuracy of ±0.28,
while the maximum deviation should not exceed ±1.7. In addition,
the graph in Fig. 8 illustrates that the mean error
exhibits a fairly uniform distribution
over the whole permittivity domain. Although
the obtained results can in general be considered very
promising, a training set of a hundred samples could, in
practice, become a limit for the effectiveness of the method.
The main drawback of supervised approaches is, in fact,
the need to find a sufficient number of on-field
examples when no auxiliary data can be used for the learning
phase. For this reason, we measured the network's
performance by training it with an increasing number of I/O
patterns, from 10 to 100, in order to evaluate the minimum
training set size S that can guarantee an adequate level of
accuracy.




Figure 8. Mean relative error for each value of $\varepsilon_L$.

We actually found that the algorithm provides acceptable
results even with very small training sets (see Table
II). Even in the worst case of S=10, it is affected by a
mean error of 14%, which is still satisfactory if we consider
that it corresponds to an absolute mismatch with respect to the
expected value of about 1.5. Maximum absolute errors seem,
nevertheless, too high to justify the use of very small
training sets. A further aspect that should be mentioned
regards the complementary trend shown by the errors on the
training and test sets, which describe two opposite curves
converging to a single value (Fig. 9). This confirms that a
proper fitting of the neural network yields good generalization
performance (which means precisely that the expected
error on unknown patterns should be very close to the one
obtained during the learning phase). To complete the performance
assessment and test the method under additional realistic
hypotheses, we evaluated the response of the MLP to
e.m. signals generated by planar surfaces and lossy media. In
the former case, further simulations have been carried out
employing the already mentioned horizontal reference plane
as the upper boundary of the layer. In the latter case, instead, we
have generated both rough and planar media with a conductivity
$\sigma_L$ ranging from 0.01 to 10 S/m.
Figure 9. Mean relative error for increasing values of S.

Table II. Absolute and relative errors for increasing values of S.
As illustrated in Fig. 10a, in the case of planar surfaces the
network performance is comparable to that of the rough model
(mean error of 2.69%), while in the case of lossy layers the
reconstruction is affected by a relative error as high as 150%.
Such behavior can be explained by noting that the flat interface
response is basically a particular case of the rough response, so
we can expect the scattered signals to have amplitudes
matching those employed within the training phase,
and therefore to be properly invertible by the neural network.
Lossy media, instead, suffer a twofold drawback: non-zero
values of conductivity induce either a strong distortion of
the backscattered signals, causing a considerable deformation
of the wavelets (not recognizable by the MLP), or out-of-domain
amplitude values (not even manageable by the MLP).
To improve performance, it is possible to replace a few
samples of the training set with input features extracted from
lossy surfaces, and then re-train the MLP, in order to show the
network how to recognize this kind of signal. A strong reduction
of the errors can indeed be observed (see Fig. 10b),
from 145% to 32%, even though, in absolute terms, the overall
performance remains not completely satisfactory.

1 The percentage relative error of the output y with respect to the
reference value y* has been computed according to the following
formula:

$$ \mathrm{err}(y) = \frac{|y^* - y|}{|y^*|} \times 100 $$
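The percentage relative error of footnote 1, together with the mean and maximum figures quoted in this section, can be sketched as follows. Normalizing by the reference value y* is an assumption made here from the wording of the footnote, and the function names and toy values are purely illustrative.

```python
def percentage_relative_error(y, y_ref):
    """Percentage relative error of the output y with respect to y_ref."""
    if y_ref == 0:
        raise ValueError("reference value must be non-zero")
    return abs(y_ref - y) / abs(y_ref) * 100.0


def mean_and_max_error(outputs, references):
    """Mean and maximum percentage relative error over a set of samples."""
    errors = [
        percentage_relative_error(y, r) for y, r in zip(outputs, references)
    ]
    return sum(errors) / len(errors), max(errors)


# Toy values only, not results from the paper.
mean_err, max_err = mean_and_max_error([7.0, 5.0], [8.0, 4.0])
# mean_err == 18.75, max_err == 25.0
```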

IV. Conclusions

In this paper a (semi-)automatic ANN-based processing
chain for GPR data analysis has been presented. In particular,
we have shown that it is possible to feed a Multi-Layer
Perceptron with a suitable set of input features extracted from
the radargrams acquired during NDT in order to determine
the permittivity of a surface layer. Supplied with a sufficient
number of training samples, the network performs very well
in the presence of both planar and rough surfaces. Moreover,
satisfactory generalization capabilities have also been shown
when small training sets are provided. Such behavior suggests
a possible over-parametrization of the neural network, whose
design could be revised in order to reduce the overall
complexity. A main drawback that has emerged from the analysis
regards the limitations in handling signals scattered by lossy
media: the distortion introduced by conductivity can induce
deformation of the wavelets and out-of-domain values, leading to
not completely satisfactory performance.
Figure 10. Mean relative error for rough, planar and lossy media.

Future work will therefore be devoted to solving the
above-mentioned open issues and to extending the proposed
solution to multi-layered media. Particular attention will also
be paid to the processing of noisy signals.